1 Internet and WWW-based systems: Autonomous visual model building based on
image crawling through internet search engines
Xiaodan Song, Ching-Yung Lin, Ming-Ting Sun

October 2004 Proceedings of the 6th ACM SIGMM international workshop on 
Multimedia information retrieval MIR '04 

Publisher: ACM Press 

Full text available: pdf(575.15 KB) Additional Information: full citation, abstract, references, citings, index terms

In this paper, we propose an autonomous learning scheme to automatically build visual 
semantic concept models from the output data of Internet search engines without any 
manual labeling work. First of all, images are gathered by crawling through the Internet 
using a search engine such as Google. Then, we model the search results as
"Quasi-Positive Bags" in the Multiple-Instance Learning (MIL) framework. We call this generalized
MIL (GMIL). We propose an algorithm called "Bag K-Means" to find ... 



Keywords: automatic training, content-based image retrieval, cross-modality, image 
crawling, multiple-instance learning, quasi-positive bag, uncertain labeling density 



2 Data extraction: Fully automatic wrapper generation for search engines

Hongkun Zhao, Weiyi Meng, Zonghuan Wu, Vijay Raghavan, Clement Yu 

May 2005 Proceedings of the 14th international conference on World Wide Web 

WWW '05
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

When a query is submitted to a search engine, the search engine returns a dynamically 
generated result page containing the result records, each of which usually consists of a 
link to and/or snippet of a retrieved Web page. In addition, such a result page often also 
contains information irrelevant to the query, such as information related to the hosting 
site of the search engine and advertisements. In this paper, we present a technique for 
automatically producing wrappers that can be used to extr ... 

Keywords: information extraction, search engine, wrapper generation 



3 Guidelines: Designing search engine user interfaces for the visually impaired
Barbara Leporini, Patrizia Andronico, Marina Buzzi

May 2004 Proceedings of the 2004 international cross-disciplinary workshop on Web 
accessibility (W4A) W4A '04 
Publisher: ACM Press

Full text available: pdf(500.49 KB) Additional Information: full citation, abstract, references, index terms

Search engines are a fundamental tool for retrieving specific and appropriate information 
on the Internet; for this reason it is essential for any user to be able to interact with 
simple, clear and accessible interfaces. In this paper we describe the main design issues 
affecting the user interface of a search engine when a sightless user interacts by means of 
a screen reader or voice synthesizer. In particular, the most important differences 
between a visual layout and aural perception are discu ... 

Keywords: Internet, accessibility, search engine, usability, user interface design, web 
navigation 



4 Visual information retrieval from large distributed online repositories
Shih-Fu Chang, John R. Smith, Mandis Beigi, Ana Benitez 
December 1997 Communications of the ACM, Volume 40 Issue 12 

Publisher: ACM Press 

Full text available: pdf(1.96 MB) Additional Information: full citation, references, citings, index terms



5 Image II: Analysing the performance of visual, concept and text features in
content-based video retrieval
Mika Rautiainen, Timo Ojala, Tapio Seppanen

October 2004 Proceedings of the 6th ACM SIGMM international workshop on
Multimedia information retrieval MIR '04
Publisher: ACM Press 

Full text available: pdf(493.48 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes revised content-based search experiments in the context of TRECVID 
2003 benchmark. Experiments focus on measuring content-based video retrieval 
performance with following search cues: visual features, semantic concepts and text. The 
fusion of features uses weights and similarity ranks. Visual similarity is computed using 
Temporal Gradient Correlogram and Temporal Color Correlogram features that are 
extracted from the dynamic content of a video shot. Automatic speech recog ... 

Keywords: Borda count, content-based video retrieval, feature fusion, semantic concepts 



6 PicASHOW: pictorial authority search by hyperlinks on the web

January 2002 ACM Transactions on Information Systems (TOIS), volume 20 issue 1 

Publisher: ACM Press 

Full text available: pdf(436.32 KB) Additional Information: full citation, abstract, references, index terms, review

We describe PicASHOW, a fully automated WWW image retrieval system that is based on 
several link-structure analyzing algorithms. Our basic premise is that a page p displays 
(or links to) an image when the author of p considers the image to be of value to the 
viewers of the page. We thus extend some well known link-based WWW page retrieval 
schemes to the context of image retrieval. PicASHOW's analysis of the link structure 
enables it to retrieve relevant images even when those ... 

Keywords: Image retrieval, hubs and authorities, image hubs, link structure analysis 



7 Posters: Predicting query difficulty on the web by learning visual clues

Eric C. Jensen, Steven M. Beitzel, David Grossman, Ophir Frieder, Abdur Chowdhury 

August 2005 Proceedings of the 28th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '05

Publisher: ACM Press 

Full text available: pdf(224.23 KB) Additional Information: full citation, abstract, references, index terms

We describe a method for predicting query difficulty in a precision-oriented web search 
task. Our approach uses visual features from retrieved surrogate document 
representations (titles, snippets, etc.) to predict retrieval effectiveness for a query. By 
training a supervised machine learning algorithm with manually evaluated queries, visual 
clues indicative of relevance are discovered. We show that this approach has a moderate 
correlation of 0.57 with precision at 10 scores from manual relevance ... 

Keywords: query difficulty, web search 



8 Designing search engine user interfaces for the visually impaired 
Barbara Leporini, Patrizia Andronico, Marina Buzzi 

June 2003 ACM SIGCAPH Computers and the Physically Handicapped, issue 76 
Publisher: ACM Press 

Full text available: pdf(241.11 KB) Additional Information: full citation, abstract, references, citings, index terms

Search engines are a fundamental tool for retrieving specific and appropriate information 
on the Internet; for this reason it is essential for any user to be able to interact with 
simple, clear and accessible interfaces. In this paper we discuss the main differences 
between a visual layout and aural perception, and propose a set of guidelines for search 
engine user interface (UI) design.



Keywords: Internet, accessibility, search engine, usability, user interface, user interface 
design, web navigation 



9 PicASHOW: pictorial authority search by hyperlinks on the Web 

Ronny Lempel, Aya Soffer 
April 2001 Proceedings of the 10th international conference on World Wide Web
WWW '01 

Publisher: ACM Press 

Full text available: pdf(633.77 KB) Additional Information: full citation, references, citings, index terms



Keywords: hubs and authorities, image hubs, image retrieval, link structure analysis 



10 Technical session 15: WWW image retrieval: Hierarchical clustering of WWW image
search results using visual, textual and link information
Deng Cai, Xiaofei He, Zhiwei Li, Wei-Ying Ma, Ji-Rong Wen

October 2004 Proceedings of the 12th annual ACM international conference on 
Multimedia MULTIMEDIA '04 

Publisher: ACM Press 

Full text available: pdf(1.15 MB) Additional Information: full citation, abstract, references, citings, index terms

We consider the problem of clustering Web image search results. Generally, the image 
search results returned by an image search engine contain multiple topics. Organizing the 
results into different semantic clusters facilitates users' browsing. In this paper, we 
propose a hierarchical clustering method using visual, textual and link analysis. By using a 
vision-based page segmentation algorithm, a web page is partitioned into blocks, and the 
textual and link information of an image can be accu ... 



clustering

Feng Jing, Changhu Wang, Yuhuan Yao, Kefeng Deng, Lei Zhang, Wei-Ying Ma 
October 2006 Proceedings of the 14th annual ACM international conference on 
Multimedia MULTIMEDIA '06 

Publisher: ACM Press 

Full text available: pdf(536.49 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we propose IGroup, an efficient and effective algorithm that organizes Web
image search results into clusters. IGroup is different from all existing Web image search 
results clustering algorithms that only cluster the top few images using visual or textual 
features. Our proposed algorithm first identifies several query-related semantic clusters 
based on a key phrases extraction algorithm originally proposed for clustering general
Web search results. Then, all the resulting images ... 

Keywords: search result clustering, user interface design, web image search 



12 Image Retrieval from the World Wide Web: Issues, Techniques, and Systems
M. L. Kherfi, D. Ziou, A. Bernardi

March 2004 ACM Computing Surveys (CSUR), volume 36 issue 1

Publisher: ACM Press 

Full text available: pdf(294.13 KB) Additional Information: full citation, abstract, references, citings, index terms

With the explosive growth of the World Wide Web, the public is gaining access to massive 
amounts of information. However, locating needed and relevant information remains a 
difficult task, whether the information is textual or visual. Text search engines have 
existed for some years now and have achieved a certain degree of success. However,
despite the large number of images available on the Web, image search engines are still 
rare. In this article, we show that in order to allow people to profi ... 

Keywords: Image-retrieval, World Wide Web, crawling, feature extraction and selection, 
indexing, relevance feedback, search, similarity 



13 VideoQ: an automated content based video search system using visual cues 
Shih-Fu Chang, William Chen, Horace J. Meng, Hari Sundaram, Di Zhong 
November 1997 Proceedings of the fifth ACM international conference on Multimedia 

MULTIMEDIA '97 
Publisher: ACM Press 

Full text available: pdf(1.67 MB) Additional Information: full citation, references, citings, index terms




14 Late breaking posters: MetaCrystal: visual interface for meta searching 
Anselm Spoerri 

April 2004 CHI '04 extended abstracts on Human factors in computing systems CHI 
'04 

Publisher: ACM Press 

Full text available: pdf(77.75 KB) Additional Information: full citation, abstract, references, index terms

MetaCrystal visualizes the degree of overlap between the top results returned by different 
search engines. Linked overview tools support rapid exploration, facilitate advanced 
filtering operations and guide users toward relevant information. The direct manipulation 
interface enables users to iteratively compose and edit meta searches. MetaCrystal 
addresses the problem of the effective fusion of different search engine results by helping 



15 Poster session 1: multimedia retrieval: Visual pattern discovery using web images
Yongqing Sun, Satoshi Shimada, Masashi Morimoto 

October 2006 Proceedings of the 8th ACM international workshop on Multimedia 
information retrieval MIR '06 

Publisher: ACM Press 

Full text available: pdf(660.39 KB) Additional Information: full citation, abstract, references, index terms

In this paper, a novel approach for discovering visual patterns associated with semantic 
concepts using web image resources is proposed. This approach can be used to improve 
the performance in image clustering and retrieval, image annotation, and other 
applications such as object recognition. Exploring the rich information in web images that 
represent semantic concepts as both visual content and text information, this research 
attempts to effectively learn intrinsic patterns related to semantic c ... 

Keywords: image clustering and retrieval, unsupervised learning, visual pattern 
discovering, web image mining 



16 Oral session 2: web searching and applications: Probabilistic web image gathering 
Keiji Yanai, Kobus Barnard

November 2005 Proceedings of the 7th ACM SIGMM international workshop on
Multimedia information retrieval MIR '05

Publisher: ACM Press 

Full text available: pdf(922.85 KB) Additional Information: full citation, abstract, references, index terms

We propose a new method for automated large scale gathering of Web images relevant to 
specified concepts. Our main goal is to build a knowledge base associated with as many 
concepts as possible for large scale object recognition studies. A second goal is supporting 
the building of more accurate text-based indexes for Web images. In our method, good 
quality candidate sets of images for each keyword are gathered as a function of analysis 
of the surrounding HTML text. The gathered images are then s ... 

Keywords: image selection, probabilistic method, web image mining, web image search 



17 Personalization of search engine services for effective retrieval and knowledge
management

Weiguo Fan, Michael D. Gordon, Praveen Pathak 

December 2000 Proceedings of the twenty first international conference on 
Information systems ICIS '00

Publisher: Association for Information Systems 

Full text available: pdf(174.07 KB) Additional Information: full citation, references, citings, index terms



Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, Abdur Chowdhury, Greg Pass
August 2005 Proceedings of the 28th annual international ACM SIGIR conference on 

Research and development in information retrieval SIGIR '05 
Publisher: ACM Press 

Full text available: pdf(325.29 KB) Additional Information: full citation, abstract, references, citings, index terms

We describe a method for improving the precision of metasearch results based upon 
scoring the visual features of documents' surrogate representations. These surrogate 
scores are used during fusion in place of the original scores or ranks provided by the 
underlying search engines. Visual features are extracted from typical search result 
surrogate information, such as title, snippet, URL, and rank. This approach specifically 
avoids the use of search engine-specific scores and collection statistics ... 



automatic information extraction with visual perceptions
Kai Simon, Georg Lausen

October 2005 Proceedings of the 14th ACM international conference on Information 
and knowledge management CIKM '05 

Publisher: ACM Press 

Full text available: pdf(484.44 KB) Additional Information: full citation, abstract, references, citings, index terms

In this paper we address the problem of unsupervised Web data extraction. We show that 
unsupervised Web data extraction becomes feasible when supposing pages that are made 
up of repetitive patterns, as it is the case, e.g., for search engine result pages. Hereby the 
extraction rules are generated automatically without any training or human interaction, by 
means of operating on the DOM tree respectively the flat tag token sequence of a single 
page. Our contribution to automatic data extraction thr ... 

Keywords: data extraction, data record alignment, visual features 



20 Demonstrations: visual interfaces: AMORE: a world-wide web image retrieval engine
Sougata Mukherjea, Kyoji Hirata, Yoshinori Hara
May 1999 CHI '99 extended abstracts on Human factors in computing systems CHI
'99

Publisher: ACM Press 

Full text available: pdf(352.39 KB) Additional Information: full citation, abstract, references, citings

Advanced Multimedia Oriented Retrieval Engine (AMORE) [2] is a World-Wide Web 
image retrieval engine integrating several techniques to facilitate effective retrieval of 
images from the Web. With the explosive growth of information that is available through 
the WWW, it is becoming increasingly difficult for the users to find the information of 
interest. Therefore, search engines are becoming very popular and useful. However, most 
of the popular search engines today are textual. Although mo ... 

Keywords: clustering, image search, keyword search, query result visualization,
world-wide web